An Incremental Algorithm to find Asymmetric Word Similarities for Fuzzy Text Mining
Authors
Abstract
Synonymy – different words with the same meaning – is a major problem for text mining systems. We have proposed asymmetric word similarities as a possible solution to this problem, where the similarity between words is computed on the basis of the similarities between contexts in which the words appear, rather than on their syntactic identity. In this paper, we give details of an incremental algorithm to compute word similarities and outline some tests which show the method’s effectiveness.
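The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a word's context is the pair (previous word, next word), and that the similarity of w1 to w2 is the share of w1's context occurrences that w2 has also been seen in. The class name ContextSimilarity and its methods are hypothetical. The counts are updated one text at a time, which is what makes the sketch incremental, and the measure is asymmetric because it is normalised by the first word's counts only.

```python
from collections import Counter, defaultdict


class ContextSimilarity:
    """Hypothetical sketch: asymmetric word similarity from shared contexts."""

    def __init__(self):
        # word -> Counter of (previous word, next word) contexts (an assumption)
        self.contexts = defaultdict(Counter)

    def add_text(self, text):
        """Incrementally fold one more piece of text into the context counts."""
        tokens = text.lower().split()
        for i in range(1, len(tokens) - 1):
            context = (tokens[i - 1], tokens[i + 1])
            self.contexts[tokens[i]][context] += 1

    def similarity(self, w1, w2):
        """Share of w1's context occurrences that w2 has also appeared in."""
        c1 = self.contexts.get(w1)
        c2 = self.contexts.get(w2)
        if not c1 or not c2:
            return 0.0
        shared = sum(n for ctx, n in c1.items() if ctx in c2)
        return shared / sum(c1.values())


model = ContextSimilarity()
for sentence in (
    "the quick fox ran home",
    "the fast fox ran home",
    "the quick dog sat down",
):
    model.add_text(sentence)

print(model.similarity("fast", "quick"))  # every context of "fast" is shared
print(model.similarity("quick", "fast"))  # only half of "quick"'s contexts are
```

In this toy corpus "fast" only ever appears where "quick" also appears, while "quick" has an extra context of its own, so the two directions give different values. A real system would need a larger context window, tokenisation, and a fuzzy treatment of context frequencies, none of which are shown here.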
Similar resources
Document Clustering Based on Ontology and a Fuzzy Approach
Data mining, also known as knowledge discovery in databases, is the process of discovering previously unknown knowledge in large amounts of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, the unsupervised classification of similar documents into different groups, is one of the important techniques of text mining. The most important step...
Using Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents
Text classification is an important research field in information retrieval and text mining. Its main task is to assign text documents to predefined categories based on the documents' contents and labeled training samples. Since word detection is a difficult and time-consuming task in the Persian language, a Bayesian text classifier is an appropriate approach to deal with different...
Extraction-Based Text Summarization Using Fuzzy Analysis
Due to the explosive growth of the World Wide Web, automatic text summarization has become an essential tool for web users. In this paper we present a novel approach for creating text summaries. Using fuzzy logic and WordNet, our model extracts the most relevant sentences from an original document. The approach utilizes fuzzy measures and inference on the extracted textual information from the docu...
Word Similarity for Document Grouping using Soft Computing
Technology has provided a quicker and more efficient way of accessing information through the web and databases in organizations that implement information systems to achieve a competitive edge. The simplest way of filtering information is to extract keywords and use them to measure document relevance. Nonetheless, getting to the right document is often a problem. Synonymy, i.e., tw...